[codex] Preserve managed endpoint failure diagnostics #3366
juliusmarminge wants to merge 4 commits into
Approvability
Verdict: Needs human review
This PR modifies error handling behavior across multiple code paths in the managed endpoint runtime, changing what diagnostic information is logged and introducing new error types. While the changes appear security-positive (avoiding sensitive data in logs), the substantive modifications to error handling infrastructure warrant human review.
Dismissing prior approval to re-evaluate c9180e8
c9180e8 to 2f9cc13
Dismissing prior approval to re-evaluate 2f9cc13
2f9cc13 to b7a1982
Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Mixed causes drop failure diagnostics
  - Changed the interrupt-only check from `interruptionReasons.length > 0` to `interruptionReasons.length === cause.reasons.length` so logging is only suppressed for purely-interrupt causes, and added a mixed-cause branch that logs diagnostics before re-propagating interrupts.
Or push these changes by commenting:
@cursor push 42a4fe013d
Preview (42a4fe013d)
diff --git a/apps/server/src/cloud/ManagedEndpointRuntime.ts b/apps/server/src/cloud/ManagedEndpointRuntime.ts
--- a/apps/server/src/cloud/ManagedEndpointRuntime.ts
+++ b/apps/server/src/cloud/ManagedEndpointRuntime.ts
@@ -88,13 +88,19 @@
   attributes: Readonly<Record<string, unknown>>,
 ) {
   const interruptionReasons = cause.reasons.filter(Cause.isInterruptReason);
-  if (interruptionReasons.length > 0) {
+  if (interruptionReasons.length > 0 && interruptionReasons.length === cause.reasons.length) {
     return Effect.failCause(Cause.fromReasons<never>(interruptionReasons));
   }
-  return Effect.logWarning(message, {
+  const log = Effect.logWarning(message, {
     ...attributes,
     ...managedEndpointCauseDiagnostics(cause),
   });
+  if (interruptionReasons.length > 0) {
+    return log.pipe(
+      Effect.andThen(Effect.failCause(Cause.fromReasons<never>(interruptionReasons))),
+    );
+  }
+  return log;
 }

 function platformErrorDiagnostics(error: PlatformError.PlatformError) {
Reviewed by Cursor Bugbot for commit b7a1982.
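The three outcomes of the fixed check (pure interrupt, mixed cause, failure only) can be sketched without the Effect runtime. This is a minimal model, not the code in `ManagedEndpointRuntime.ts`: the `Reason` union, `CauseDecision`, and `decideCauseHandling` are hypothetical names introduced here for illustration and are not Effect's `Cause` API.

```typescript
// Hypothetical stand-ins for the reasons carried by a cause (not Effect's real types).
type Reason =
  | { readonly _tag: "Interrupt" }
  | { readonly _tag: "Fail"; readonly error: string };

interface CauseDecision {
  // Whether a diagnostic warning should be logged for this cause.
  readonly shouldLog: boolean;
  // Interrupt reasons that must still be re-propagated after logging.
  readonly propagate: ReadonlyArray<Reason>;
}

function decideCauseHandling(reasons: ReadonlyArray<Reason>): CauseDecision {
  const interrupts = reasons.filter((reason) => reason._tag === "Interrupt");
  // Logging is suppressed only when every reason is an interrupt.
  const interruptOnly = interrupts.length > 0 && interrupts.length === reasons.length;
  return { shouldLog: !interruptOnly, propagate: interrupts };
}
```

Under this model a mixed cause yields `shouldLog: true` together with a non-empty `propagate` list, which corresponds to the branch the autofix adds: log first, then fail with the interrupt reasons.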
a42406d to e1eb3fb
Co-authored-by: codex <codex@users.noreply.github.com>
e1eb3fb to 96f192e


Summary
- Use `Effect.catchTags` for tagged platform/config alternatives and keep user-facing spawn failures independent of internal causes

Verification
- `vp test apps/server/src/cloud/ManagedEndpointRuntime.test.ts` (13 passed)
- `vp check` (passes with 20 pre-existing warnings)
- `vp run typecheck`

Overlap audit
No active PR touches the three changed files (`gh pr list --state open --limit 1000`).

Note
Medium Risk
Changes cloud relay connector supervision, restart triggers on probe failure, and persisted config bootstrap—important for tunnel availability but scoped to managed endpoint runtime logging and reconciliation.
Overview
Cloud managed endpoint runtime logging and error handling are tightened so operators get useful structure without echoing secrets or nested failure text.
Logging now uses bounded diagnostic summaries (`managedEndpointCauseDiagnostics`, `platformErrorDiagnostics`) instead of raw `cause` objects or relay line content; transport warnings log `outputLength` only, not redacted or full output. Pure scope interruptions are re-propagated via `logManagedEndpointCause` and are not logged as supervisor/observer failures.

Config load wraps secret read/decode in tagged `CloudManagedEndpointRuntimeConfigReadError` / `DecodeError` (exact `.cause` retained on the error class); startup logs only `errorTag`, `resource`, and `causeTag`. `decodeRuntimeConfig` switches from `decodeUnknownOption` to `decodeUnknownEffect`.

Runtime behavior: failed `isRunning` probes log a structured warning and treat the connector as dead (restart on next reconcile); spawn `PlatformError` yields a fixed user-facing reason (`Failed to start the relay client.`) while logs stay tag/module/method-only.

Reviewed by Cursor Bugbot for commit 96f192e. Bugbot is set up for automated code reviews on this repo.
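The config-load behavior described above (exact cause kept on the error, only `errorTag`, `resource`, and `causeTag` logged) might look roughly like the following. This is a sketch under stated assumptions: the plain-object error shapes and the `causeTag` and `startupLogFields` helpers are hypothetical, and the PR's real errors are Effect tagged error classes whose definitions are not shown on this page.

```typescript
// Hypothetical plain-object versions of the two tagged config errors.
interface ConfigReadError {
  readonly _tag: "CloudManagedEndpointRuntimeConfigReadError";
  readonly resource: string;
  readonly cause: unknown; // exact cause retained for programmatic inspection
}

interface ConfigDecodeError {
  readonly _tag: "CloudManagedEndpointRuntimeConfigDecodeError";
  readonly resource: string;
  readonly cause: unknown;
}

type ConfigError = ConfigReadError | ConfigDecodeError;

// Reduce an arbitrary cause to a short tag so no message text reaches the log.
function causeTag(cause: unknown): string {
  if (typeof cause === "object" && cause !== null && "_tag" in cause) {
    const tag = (cause as { _tag: unknown })._tag;
    if (typeof tag === "string") return tag;
  }
  return cause instanceof Error ? cause.name : "Unknown";
}

// Only these three fields are logged at startup; the cause itself is not.
function startupLogFields(error: ConfigError) {
  return {
    errorTag: error._tag,
    resource: error.resource,
    causeTag: causeTag(error.cause),
  };
}
```

Keeping `cause` on the error while logging only its tag means callers can still branch on the underlying failure without secret-bearing messages being written to logs.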
Note
Preserve structured failure diagnostics in managed endpoint logging
- Replaces raw cause logging in `CloudManagedEndpointRuntime` with bounded diagnostic summaries (tag counts, module/method fields) via a new `managedEndpointCauseDiagnostics` helper.
- Updates `logManagedEndpointCause` to handle mixed failure/interruption causes: pure interruptions are never logged, mixed causes are logged with diagnostics while the interrupt cause is still propagated.
- Adds `platformErrorDiagnostics` for consistent flat diagnostic objects from `PlatformError` without nested cause strings.
- Adds `CloudManagedEndpointRuntimeConfigReadError` and `CloudManagedEndpointRuntimeConfigDecodeError` tagged error classes; `readRuntimeConfig` and `decodeRuntimeConfig` now surface distinct typed failures instead of returning `Option`.
- Spawn failures now return the fixed reason `'Failed to start the relay client.'` with no nested error detail; `isRunning` probe failures with `PlatformError` now trigger a restart instead of being treated as not-running silently.

Macroscope summarized 96f192e.
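The runtime-behavior points (a failed probe counts as a dead connector, and spawn failures expose a fixed reason while logs stay tag/module/method-only) can be illustrated as below. Everything except the quoted reason string is an assumption: `PlatformErrorLike`, `ProbeResult`, `shouldRestart`, and `spawnFailure` are hypothetical names, not identifiers from the PR.

```typescript
// Fixed user-facing reason quoted in the PR description.
const SPAWN_FAILURE_REASON = "Failed to start the relay client.";

// Hypothetical flat view of a platform error.
interface PlatformErrorLike {
  readonly _tag: string;
  readonly module: string;
  readonly method: string;
  readonly message: string; // may contain sensitive detail, so it is never logged
}

type ProbeResult =
  | { readonly ok: true; readonly running: boolean }
  | { readonly ok: false; readonly error: PlatformErrorLike };

// Flat log fields: tag, module, and method only.
function platformErrorLogFields(error: PlatformErrorLike) {
  return { errorTag: error._tag, module: error.module, method: error.method };
}

// A failed probe is treated as a dead connector, so the next reconcile restarts it.
function shouldRestart(probe: ProbeResult): boolean {
  return probe.ok ? !probe.running : true;
}

// The user sees the fixed reason; only the bounded fields go to the log.
function spawnFailure(error: PlatformErrorLike) {
  return { reason: SPAWN_FAILURE_REASON, logFields: platformErrorLogFields(error) };
}
```

Separating `reason` from `logFields` keeps the user-facing message stable regardless of which internal error occurred.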